Refactoring postgres_fdw/connection.c

  • Jump to comment-1
    masao.fujii@oss.nttdata.com2022-07-25T15:54:47+00:00
    Hi, When reviewing the postgres_fdw parallel-abort patch [1], I found that there are several duplicate codes in postgres_fdw/connection.c. Which seems to make it harder to review the patch changing connection.c. So I'd like to remove such duplicate codes and refactor the functions in connection.c. I attached the following three patches. There are two functions, pgfdw_get_result() and pgfdw_get_cleanup_result(), to get a query result. They have almost the same code, call PQisBusy(), WaitLatchOrSocket(), PQconsumeInput() and PQgetResult() in the loop, but only pgfdw_get_cleanup_result() allows its callers to specify the timeout. 0001 patch transforms pgfdw_get_cleanup_result() to the common function to get a query result and makes pgfdw_get_result() use it instead of its own (duplicate) code. The patch also renames pgfdw_get_cleanup_result() to pgfdw_get_result_timed(). pgfdw_xact_callback() and pgfdw_subxact_callback() have similar codes to issue COMMIT or RELEASE SAVEPOINT commands. 0002 patch adds the common function, pgfdw_exec_pre_commit(), for that purpose, and changes those functions so that they use the common one. pgfdw_finish_pre_commit_cleanup() and pgfdw_finish_pre_subcommit_cleanup() have similar codes to wait for the results of COMMIT or RELEASE SAVEPOINT commands. 0003 patch adds the common function, pgfdw_finish_pre_commit(), for that purpose, and replaces those functions with the common one. That is, pgfdw_finish_pre_commit_cleanup() and pgfdw_finish_pre_subcommit_cleanup() are no longer necessary and 0003 patch removes them. [1] https://commitfest.postgresql.org/38/3392/ Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
    • Jump to comment-1
      etsuro.fujita@gmail.com2022-07-26T10:26:20+00:00
      Fujii-san, On Tue, Jul 26, 2022 at 12:55 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > When reviewing the postgres_fdw parallel-abort patch [1], I found that > there are several duplicate codes in postgres_fdw/connection.c. > Which seems to make it harder to review the patch changing connection.c. > So I'd like to remove such duplicate codes and refactor the functions > in connection.c. I attached the following three patches. > > There are two functions, pgfdw_get_result() and pgfdw_get_cleanup_result(), > to get a query result. They have almost the same code, call PQisBusy(), > WaitLatchOrSocket(), PQconsumeInput() and PQgetResult() in the loop, > but only pgfdw_get_cleanup_result() allows its callers to specify the timeout. > 0001 patch transforms pgfdw_get_cleanup_result() to the common function > to get a query result and makes pgfdw_get_result() use it instead of > its own (duplicate) code. The patch also renames pgfdw_get_cleanup_result() > to pgfdw_get_result_timed(). > > pgfdw_xact_callback() and pgfdw_subxact_callback() have similar codes to > issue COMMIT or RELEASE SAVEPOINT commands. 0002 patch adds the common function, > pgfdw_exec_pre_commit(), for that purpose, and changes those functions > so that they use the common one. > > pgfdw_finish_pre_commit_cleanup() and pgfdw_finish_pre_subcommit_cleanup() > have similar codes to wait for the results of COMMIT or RELEASE SAVEPOINT commands. > 0003 patch adds the common function, pgfdw_finish_pre_commit(), for that purpose, > and replaces those functions with the common one. > That is, pgfdw_finish_pre_commit_cleanup() and pgfdw_finish_pre_subcommit_cleanup() > are no longer necessary and 0003 patch removes them. > > [1] https://commitfest.postgresql.org/38/3392/ Thanks for working on this! I'd like to review this after the end of the current CF. Could you add this to the next CF? Best regards, Etsuro Fujita
      • Jump to comment-1
        masao.fujii@oss.nttdata.com2022-07-26T10:46:52+00:00
        On 2022/07/26 19:26, Etsuro Fujita wrote: > Thanks for working on this! I'd like to review this after the end of > the current CF. Thanks! > Could you add this to the next CF? Yes. https://commitfest.postgresql.org/39/3782/ Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
        • Jump to comment-1
          etsuro.fujita@gmail.com2022-07-27T05:30:14+00:00
          On Tue, Jul 26, 2022 at 7:46 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > On 2022/07/26 19:26, Etsuro Fujita wrote: > > Could you add this to the next CF? > > Yes. > https://commitfest.postgresql.org/39/3782/ Thanks! Best regards, Etsuro Fujita
    • Jump to comment-1
      horikyota.ntt@gmail.com2022-07-26T07:25:27+00:00
      At Tue, 26 Jul 2022 00:54:47 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > Hi, > > When reviewing the postgres_fdw parallel-abort patch [1], I found that > there are several duplicate codes in postgres_fdw/connection.c. > Which seems to make it harder to review the patch changing > connection.c. > So I'd like to remove such duplicate codes and refactor the functions > in connection.c. I attached the following three patches. > > There are two functions, pgfdw_get_result() and > pgfdw_get_cleanup_result(), > to get a query result. They have almost the same code, call > PQisBusy(), > WaitLatchOrSocket(), PQconsumeInput() and PQgetResult() in the loop, > but only pgfdw_get_cleanup_result() allows its callers to specify the > timeout. > 0001 patch transforms pgfdw_get_cleanup_result() to the common > function > to get a query result and makes pgfdw_get_result() use it instead of > its own (duplicate) code. The patch also renames > pgfdw_get_cleanup_result() > to pgfdw_get_result_timed(). Agree to that refactoring. And it looks fine to me. > pgfdw_xact_callback() and pgfdw_subxact_callback() have similar codes > to > issue COMMIT or RELEASE SAVEPOINT commands. 0002 patch adds the common > function, > pgfdw_exec_pre_commit(), for that purpose, and changes those functions > so that they use the common one. I'm not sure the two are similar with each other. The new function pgfdw_exec_pre_commit() looks like a merger of two isolated code paths intended to share a seven-line codelet. I feel the code gets a bit harder to understand after the change. I mildly oppose to this part. > pgfdw_finish_pre_commit_cleanup() and > pgfdw_finish_pre_subcommit_cleanup() > have similar codes to wait for the results of COMMIT or RELEASE > SAVEPOINT commands. > 0003 patch adds the common function, pgfdw_finish_pre_commit(), for > that purpose, > and replaces those functions with the common one. > That is, pgfdw_finish_pre_commit_cleanup() and > pgfdw_finish_pre_subcommit_cleanup() > are no longer necessary and 0003 patch removes them. It gives the same feeling with 0002. Considering that pending_deallocate becomes non-NIL only when toplevel, 38 lines out of 66 lines of the function are the toplevel-dedicated stuff. > [1] https://commitfest.postgresql.org/38/3392/ regards. -- Kyotaro Horiguchi NTT Open Source Software Center
      • Jump to comment-1
        masao.fujii@oss.nttdata.com2022-07-26T09:33:04+00:00
        On 2022/07/26 16:25, Kyotaro Horiguchi wrote: > Agree to that refactoring. And it looks fine to me. Thanks for reviewing the patches! > I'm not sure the two are similar with each other. The new function > pgfdw_exec_pre_commit() looks like a merger of two isolated code paths > intended to share a seven-line codelet. I feel the code gets a bit > harder to understand after the change. I mildly oppose to this part. If so, we can pgfdw_exec_pre_commit() into two, one is the common function that sends or executes the command (i.e., calls do_sql_command_begin() or do_sql_command()), and another is the function only for toplevel. The latter function calls the common function and then executes DEALLOCATE ALL things. But this is not the way that other functions like pgfdw_abort_cleanup() is implemented. Those functions have both codes for toplevel and !toplevel (i.e., subxact), and run the processings depending on the argument "toplevel". So I'm thinking that pgfdw_exec_pre_commit() implemented in the same way is better. > It gives the same feeling with 0002. Considering that > pending_deallocate becomes non-NIL only when toplevel, 38 lines out of > 66 lines of the function are the toplevel-dedicated stuff. Same as above. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
        • Jump to comment-1
          horikyota.ntt@gmail.com2022-07-27T01:36:12+00:00
          At Tue, 26 Jul 2022 18:33:04 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > > I'm not sure the two are similar with each other. The new function > > pgfdw_exec_pre_commit() looks like a merger of two isolated code paths > > intended to share a seven-line codelet. I feel the code gets a bit > > harder to understand after the change. I mildly oppose to this part. > > If so, we can pgfdw_exec_pre_commit() into two, one is the common > function that sends or executes the command (i.e., calls > do_sql_command_begin() or do_sql_command()), and another is > the function only for toplevel. The latter function calls > the common function and then executes DEALLOCATE ALL things. > > But this is not the way that other functions like > pgfdw_abort_cleanup() > is implemented. Those functions have both codes for toplevel and > !toplevel (i.e., subxact), and run the processings depending > on the argument "toplevel". So I'm thinking that > pgfdw_exec_pre_commit() implemented in the same way is better. I didn't see it from that viewpoint but I don't think that unconditionally justifies other refactoring. If we merge pgfdw_finish_pre_(sub)?commit_cleanup()s this way, in turn pgfdw_subxact_callback() and pgfdw_xact_callback() are going to be almost identical except event IDs to handle. But I don't think we would want to merge them. A concern on 0002 is that it is hiding the subxact-specific steps from the subxact callback. It would look reasonable if it were called from two or more places for each topleve and !toplevel, but actually it has only one caller for each. So I think that pgfdw_exec_pre_commit should not do that and should be renamed to pgfdw_commit_remote() or something. On the other hand pgfdw_finish_pre_commit() hides toplevel-specific steps from the caller so the same argument holds. Another point that makes me concern about the patch is the new function takes an SQL statement, along with the toplevel flag. I guess the reason is that the command for subxact (RELEASE SAVEPOINT %d) requires the current transaction level. However, the values isobtainable very cheap within the cleanup functions. So I propose to get rid of the parameter "sql" from the two functions. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
          • Jump to comment-1
            masao.fujii@oss.nttdata.com2022-07-28T06:26:42+00:00
            On 2022/07/27 10:36, Kyotaro Horiguchi wrote: > At Tue, 26 Jul 2022 18:33:04 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in >>> I'm not sure the two are similar with each other. The new function >>> pgfdw_exec_pre_commit() looks like a merger of two isolated code paths >>> intended to share a seven-line codelet. I feel the code gets a bit >>> harder to understand after the change. I mildly oppose to this part. >> >> If so, we can pgfdw_exec_pre_commit() into two, one is the common >> function that sends or executes the command (i.e., calls >> do_sql_command_begin() or do_sql_command()), and another is >> the function only for toplevel. The latter function calls >> the common function and then executes DEALLOCATE ALL things. >> >> But this is not the way that other functions like >> pgfdw_abort_cleanup() >> is implemented. Those functions have both codes for toplevel and >> !toplevel (i.e., subxact), and run the processings depending >> on the argument "toplevel". So I'm thinking that >> pgfdw_exec_pre_commit() implemented in the same way is better. > > I didn't see it from that viewpoint but I don't think that > unconditionally justifies other refactoring. If we merge > pgfdw_finish_pre_(sub)?commit_cleanup()s this way, in turn > pgfdw_subxact_callback() and pgfdw_xact_callback() are going to be > almost identical except event IDs to handle. But I don't think we > would want to merge them. I don't think they are so identical because (as you say) they have to handle different event IDs. So I agree we don't want to merge them. > A concern on 0002 is that it is hiding the subxact-specific steps from > the subxact callback. It would look reasonable if it were called from > two or more places for each topleve and !toplevel, but actually it has > only one caller for each. So I think that pgfdw_exec_pre_commit > should not do that and should be renamed to pgfdw_commit_remote() or > something. On the other hand pgfdw_finish_pre_commit() hides > toplevel-specific steps from the caller so the same argument holds. So you conclusion is to rename pgfdw_exec_pre_commit() to pgfdw_commit_remote() or something? > Another point that makes me concern about the patch is the new > function takes an SQL statement, along with the toplevel flag. I guess > the reason is that the command for subxact (RELEASE SAVEPOINT %d) > requires the current transaction level. However, the values > isobtainable very cheap within the cleanup functions. So I propose to > get rid of the parameter "sql" from the two functions. Yes, that's possible. That is, pgfdw_exec_pre_commit() can construct the query string by executing the following codes, instead of accepting the query as an argument. But one downside of this approach is that the following codes are executed for every remote subtransaction entries. Maybe it's cheap to construct the query string as follows, but I'd like to avoid any unnecessray overhead if possible. So the patch makes the caller, pgfdw_subxact_callback(), construct the query string only once and give it to pgfdw_exec_pre_commit(). curlevel = GetCurrentTransactionNestLevel(); snprintf(sql, sizeof(sql), "RELEASE SAVEPOINT s%d", curlevel); Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
            • Jump to comment-1
              sulamul@gmail.com2022-08-04T04:11:57+00:00
              On Thu, Jul 28, 2022 at 11:56 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > > > > On 2022/07/27 10:36, Kyotaro Horiguchi wrote: > > At Tue, 26 Jul 2022 18:33:04 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > >>> I'm not sure the two are similar with each other. The new function > >>> pgfdw_exec_pre_commit() looks like a merger of two isolated code paths > >>> intended to share a seven-line codelet. I feel the code gets a bit > >>> harder to understand after the change. I mildly oppose to this part. > >> > >> If so, we can pgfdw_exec_pre_commit() into two, one is the common > >> function that sends or executes the command (i.e., calls > >> do_sql_command_begin() or do_sql_command()), and another is > >> the function only for toplevel. The latter function calls > >> the common function and then executes DEALLOCATE ALL things. > >> > >> But this is not the way that other functions like > >> pgfdw_abort_cleanup() > >> is implemented. Those functions have both codes for toplevel and > >> !toplevel (i.e., subxact), and run the processings depending > >> on the argument "toplevel". So I'm thinking that > >> pgfdw_exec_pre_commit() implemented in the same way is better. > > > > I didn't see it from that viewpoint but I don't think that > > unconditionally justifies other refactoring. If we merge > > pgfdw_finish_pre_(sub)?commit_cleanup()s this way, in turn > > pgfdw_subxact_callback() and pgfdw_xact_callback() are going to be > > almost identical except event IDs to handle. But I don't think we > > would want to merge them. > > I don't think they are so identical because (as you say) they have to handle different event IDs. So I agree we don't want to merge them. > > > > A concern on 0002 is that it is hiding the subxact-specific steps from > > the subxact callback. It would look reasonable if it were called from > > two or more places for each topleve and !toplevel, but actually it has > > only one caller for each. So I think that pgfdw_exec_pre_commit > > should not do that and should be renamed to pgfdw_commit_remote() or > > something. On the other hand pgfdw_finish_pre_commit() hides > > toplevel-specific steps from the caller so the same argument holds. > > So you conclusion is to rename pgfdw_exec_pre_commit() to pgfdw_commit_remote() or something? > > > > Another point that makes me concern about the patch is the new > > function takes an SQL statement, along with the toplevel flag. I guess > > the reason is that the command for subxact (RELEASE SAVEPOINT %d) > > requires the current transaction level. However, the values > > isobtainable very cheap within the cleanup functions. So I propose to > > get rid of the parameter "sql" from the two functions. > > Yes, that's possible. That is, pgfdw_exec_pre_commit() can construct the query string by executing the following codes, instead of accepting the query as an argument. But one downside of this approach is that the following codes are executed for every remote subtransaction entries. Maybe it's cheap to construct the query string as follows, but I'd like to avoid any unnecessray overhead if possible. So the patch makes the caller, pgfdw_subxact_callback(), construct the query string only once and give it to pgfdw_exec_pre_commit(). > > curlevel = GetCurrentTransactionNestLevel(); > snprintf(sql, sizeof(sql), "RELEASE SAVEPOINT s%d", curlevel); > Another possibility I can see is that instead of calling pgfdw_exec_pre_commit() (similarly pgfdw_abort_cleanup) for every connection entry, we should call that once from the callback function, and for that we need to move the hash table loop inside that function. The structure of the callback function looks a little fuzzy to me where the same event is checked for every entry of the connection hash table. Instead of simply move that loop should be inside those function (e.g. pgfdw_exec_pre_commit and pgfdw_abort_cleanup), and let called those function called once w.r.t to event and that function should take care of every entry of the connection hash table. The benefit is that we would save a few processing cycles that needed to match events and call the same function for each connection entry. I tried this refactoring in 0004 patch which is not complete, and reattaching other patches too to make CFboat happy. Thoughts? Suggestions? Regards, Amul
            • Jump to comment-1
              horikyota.ntt@gmail.com2022-07-28T07:27:14+00:00
              At Thu, 28 Jul 2022 15:26:42 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > > > On 2022/07/27 10:36, Kyotaro Horiguchi wrote: > > At Tue, 26 Jul 2022 18:33:04 +0900, Fujii Masao > > <masao.fujii@oss.nttdata.com> wrote in > > I didn't see it from that viewpoint but I don't think that > > unconditionally justifies other refactoring. If we merge > > pgfdw_finish_pre_(sub)?commit_cleanup()s this way, in turn > > pgfdw_subxact_callback() and pgfdw_xact_callback() are going to be > > almost identical except event IDs to handle. But I don't think we > > would want to merge them. > > I don't think they are so identical because (as you say) they have to > handle different event IDs. So I agree we don't want to merge them. The xact_callback and subxact_callback handles different sets of event IDs so they can be merged into one switch(). I don't think there's much difference from merging the functions for xact and subxact into one rountine then calling it with a flag to chose one of the two paths. (Even though less-than-half lines of the fuction are shared..) However, still I don't think they ought to be merged. > > A concern on 0002 is that it is hiding the subxact-specific steps from > > the subxact callback. It would look reasonable if it were called from > > two or more places for each topleve and !toplevel, but actually it has > > only one caller for each. So I think that pgfdw_exec_pre_commit > > should not do that and should be renamed to pgfdw_commit_remote() or > > something. On the other hand pgfdw_finish_pre_commit() hides > > toplevel-specific steps from the caller so the same argument holds. > > So you conclusion is to rename pgfdw_exec_pre_commit() to > pgfdw_commit_remote() or something? And the remote stuff is removed from the function. That being said, I don't mean to fight this no longer since that is rather a matter of taste. > > Another point that makes me concern about the patch is the new > > function takes an SQL statement, along with the toplevel flag. I guess > > the reason is that the command for subxact (RELEASE SAVEPOINT %d) > > requires the current transaction level. However, the values > > isobtainable very cheap within the cleanup functions. So I propose to > > get rid of the parameter "sql" from the two functions. > > Yes, that's possible. That is, pgfdw_exec_pre_commit() can construct > the query string by executing the following codes, instead of > accepting the query as an argument. But one downside of this approach > is that the following codes are executed for every remote > subtransaction entries. Maybe it's cheap to construct the query string > as follows, but I'd like to avoid any unnecessray overhead if > possible. So the patch makes the caller, pgfdw_subxact_callback(), > construct the query string only once and give it to > pgfdw_exec_pre_commit(). > > curlevel = GetCurrentTransactionNestLevel(); > snprintf(sql, sizeof(sql), "RELEASE SAVEPOINT s%d", curlevel); That *overhead* has been there and I'm not sure how much actual impact it gives on performance (comparing to the surrounding code). But I would choose leaving it open-coded as-is than turning it into a function that need two tightly-bonded parameters passed and that also tightly bonded to the caller via the parameters. ...In other words, the original code doesn't seem to meet the requirement for a function. However, it's okay if you prefer the functions than the open-coded lines based on the above discussion, I'd stop objecting. regards. -- Kyotaro Horiguchi NTT Open Source Software Center